The Balog-Szemeredi-Gowers theorem has a rich history, and is one of the 
most useful tools in additive combinatorics. It began with a paper by 
Balog and Szemeredi [2], and then was refined by Gowers [3] to the following 
basic result (actually, Gowers proved somewhat more than we bother to state 
here): 

Theorem 1 There exists an absolute constant $\kappa > 0$ such that the following 
holds for all finite subsets $X$ and $Y$ of size $n > n_0$ of an abelian group: 
Suppose that there are at least $Cn^3$ solutions to $x_1 + y_1 = x_2 + y_2$, $x_i \in X$ 
and $y_i \in Y$. Then, $X$ contains a subset $X'$, of size at least $C^{\kappa} n$, such that 

$$|X' + X'| \le C^{-\kappa} n.$$
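
To make the hypothesis of Theorem 1 concrete, the following Python sketch counts the solutions to $x_1 + y_1 = x_2 + y_2$ for two toy sets of integers. The function names `count_quadruples` and `sumset`, and the two test sets, are illustrative choices and are not taken from the papers cited above. An arithmetic progression has a constant proportion of the $n^3$ possible solutions and a small sumset, whereas a set of powers of 2 has few solutions and a large sumset.

```python
from collections import Counter


def count_quadruples(X, Y):
    """Number of (x1, y1, x2, y2) in X x Y x X x Y with x1 + y1 == x2 + y2."""
    # reps[s] = number of ways to write s as x + y with x in X and y in Y
    reps = Counter(x + y for x in X for y in Y)
    return sum(r * r for r in reps.values())


def sumset(X, Y):
    """The sumset X + Y."""
    return {x + y for x in X for y in Y}


n = 50
progression = list(range(n))            # highly structured set
geometric = [2 ** i for i in range(n)]  # essentially unstructured set

e_ap = count_quadruples(progression, progression)
e_gp = count_quadruples(geometric, geometric)

# Ratio of solutions to n^3, followed by the size of the sumset.
print(e_ap / n ** 3, len(sumset(progression, progression)))
print(e_gp / n ** 3, len(sumset(geometric, geometric)))
```

In the language of Theorem 1, the progression satisfies the hypothesis with $C$ bounded below independently of $n$, while for the powers of 2 the best admissible $C$ tends to 0 as $n$ grows.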

Sudakov, Szemeredi and Vu [5] proved a refinement of this theorem (Balog 
[1] independently obtained a similar result), given as follows: 

Theorem 2 Let $n, C, K$ be positive numbers, and let $A$ and $B$ be two sets 
of $n$ integers. Suppose that there is a bipartite graph $G(A, B, E)$ with at least 
$n^2/K$ edges and $|A +_G B| \le Cn$. Then one can find a subset $A' \subseteq A$ and 
a subset $B' \subseteq B$ such that $|A'| \ge n/16K^2$, $|B'| \ge n/4K$ and 
$|A' + B'| \le 2^{12} C^3 K^5 n$. 

Remark. It is not difficult to show that this theorem, along with some 
lemmas and theorems of Ruzsa (the Ruzsa triangle inequality [6], and the 
Ruzsa-Plunnecke Theorem [4]), implies that we may take $\kappa \le 20$ in Theorem 1. 

In the same paper, Sudakov, Szemeredi and Vu [5, Theorem 4.3] proved 
the following powerful hypergraph version of the Balog-Szemeredi-Gowers 
Theorem: 

Theorem 3 For any positive integer $k$, there are polynomials $f_k(x,y)$ and 
$g_k(x,y)$ with degrees and coefficients depending only on $k$, such that the 
following holds. Let $n, C, K$ be positive numbers. If $A_1, \ldots, A_k$ are sets of $n$ 
positive integers, $H(A_1, \ldots, A_k, E)$ is a $k$-partite, $k$-uniform hypergraph with at 
least $n^k/K$ edges, and $\bigl|\bigoplus_{H,\, 1 \le i \le k} A_i\bigr| \le Cn$, then one can find subsets $A_i' \subseteq A_i$ 
such that 

• $|A_i'| \ge n/f_k(C, K)$ for all $1 \le i \le k$; 

• $|A_1' + \cdots + A_k'| \le g_k(C, K) n$. 

The notation $\bigoplus_H$ means that the sum is restricted to the hypergraph $H$. 

Beautiful and useful as it is, it would be nice if one had some control on 
the degrees of these polynomials $f_k$ and $g_k$. And, for particular applications 
that we (Croot and Borenstein) have in mind, it would be good to be able 
to control the rate of growth of sums $A_1' + \cdots + A_\ell'$, where $\ell$ is much smaller 
than $k$; that is, it would be good to be able to bound the size of this sum from 
above by 

$$C^{1+\varepsilon} K^{d_k} n, \qquad (1)$$

where $d_k$ depends only on $k$. Perhaps such a bound can be developed by 
modifying the proof of Sudakov, Szemeredi and Vu; however, in the present 
paper, we take a different tack, and produce an alternate proof of a related 
hypergraph Balog-Szemeredi-Gowers theorem, where such an upper bound as 
(1) will be implicit, though only for the case where $A_1 = \cdots = A_k$. In our 
proof, we will use some of the same standard tricks as Sudakov, Szemeredi 
and Vu do in their proof. 

The notation we use to describe this theorem, and its proof, will be 
somewhat different from that used by Sudakov, Szemeredi and Vu. Furthermore, 
we will not attempt here to give the most general formulation of the theorem. 

Theorem 4 For every $0 < \varepsilon < 1/2$ and $c > 1$, there exists $\delta > 0$, such that 
the following holds for all $k$ sufficiently large, and all sufficiently large finite 
subsets $A$ of an additive abelian group: Suppose that 

$$S \subseteq A \times A \times \cdots \times A = A^k,$$

and let 

$$\Sigma(S) := \{a_1 + \cdots + a_k : (a_1, \ldots, a_k) \in S\}.$$

If 

$$|S| \ge |A|^{k-\delta}, \quad\text{and}\quad |\Sigma(S)| < |A|^c,$$

then there exists 

$$A' \subseteq A, \quad |A'| \ge |A|^{1-\varepsilon},$$

such that 

$$|\ell A'| = |A' + \cdots + A'| \le |A'|^{c(1+\varepsilon\ell)}.$$
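
The objects in Theorem 4 can be computed directly for small examples. The Python sketch below is only an illustration under assumed toy data (the set `A`, the value of `k` and the parity condition defining `S` are all hypothetical choices): it builds a dense subset $S$ of $A^k$, forms the set of coordinate sums $\Sigma(S)$, and reports the exponents $\delta$ and $c$ for which $|S| = |A|^{k-\delta}$ and $|\Sigma(S)| = |A|^{c}$.

```python
import math
from itertools import product


def sigma(S):
    """Sigma(S): the set of coordinate sums a_1 + ... + a_k over strings in S."""
    return {sum(s) for s in S}


A = list(range(1, 11))   # toy ground set (assumption: integers 1..10)
k = 4
full = list(product(A, repeat=k))          # all of A^k
S = [s for s in full if sum(s) % 2 == 0]   # a dense subset of A^k

# Exponents defined by |S| = |A|^(k - delta) and |Sigma(S)| = |A|^c.
delta = k - math.log(len(S)) / math.log(len(A))
c = math.log(len(sigma(S))) / math.log(len(A))

print(len(S), len(sigma(S)))
print(round(delta, 3), round(c, 3))
```

Here $S$ keeps half of $A^k$, so $\delta$ is small compared with $k$, while $\Sigma(S)$ has far fewer than $|S|$ elements; this is the regime in which the theorem is of interest.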



1 Proof of Theorem 4 



1.1 Notation and basic assumptions 

It will be advantageous to describe the proof in terms of strings. So, the set 

$$S \subseteq A^k$$

will be thought of as a collection of strings of length $k$: 

$$x_1 x_2 \cdots x_k,$$

where each $x_i \in A$. 

Often, we split these strings up into substrings; for example, the string 

$$x = x_1 \cdots x_k$$

can be written as a product of a "left substring $\ell$ of length $k/2$" (assume $k$ 
is even) and a "right substring $r$ of length $k/2$". So, 

$$x = \ell r.$$



We may assume that 

$$k = 2^n,$$

since if this is not the case, then we let $k'$ be the largest power of 2 of size at 
most $k$, and proceed as follows: Given a string $x = x_1 \cdots x_k$ in $S$, we write it as 
a product $x = \ell_x r_x$, where 

$$\ell_x := x_1 \cdots x_{k'} \quad\text{and}\quad r_x := x_{k'+1} \cdots x_k.$$

Now, for some string $y$ we will have that $r_x = y$ for at least $|S|/|A|^{k-k'}$ choices 
for $x \in S$. Letting $S'$ denote the set of all strings $\ell_x$ with $r_x = y$, we will 
have 

$$|S'| \ge |A|^{k'-\delta},$$

and clearly 

$$|\Sigma(S')| \le |\Sigma(\{\ell_x y : \ell_x \in S'\})| \le |\Sigma(S)| < |A|^c.$$

So, we could just assume that our k had this value k' all along (remember, 
we get to choose k to be as large as needed to get the desired conclusion). 

1.2 The suppression of subscripts, and a comment about 
iteration 

In the proof of our theorem, we will iteratively replace our initial set S with 
other, smaller and smaller sets having certain useful properties. If we were 
so inclined, we could describe this iteration by saying that we produce a 
sequence of sets 

$$S_0 := S,\ S_1,\ S_2,\ \ldots,\ S_t, \quad\text{where}\quad S_i \subseteq A^{k_i}, \quad |S_i| \ge |A|^{k_i-\delta_i}.$$

The trouble with this is that it leads to a proliferation of subscripts, which 
can be unpleasant. 

Instead of introducing subscripts, we use the "assignment operator", 
denoted by 

$$S \leftarrow S',$$

which means that the set $S$ gets "reassigned" to the set $S'$. So, it is worth 
keeping in mind that later into the proof, $S$ refers to a different set than at 
the start of the proof. The same will be true of $k$ and $\delta$. 

1.3 Lengths of iterations and the choice of $\delta$ and $k$ 

At almost every step of our iteration, we will replace $S \subseteq A^k$ with $S' \subseteq A^{k'}$, 
satisfying 

$$|S'| \ge |A|^{k'-\delta'}, \quad\text{and}\quad |A|^{1-\delta'} \le |\Sigma(S')| \le |\Sigma(S)|^{1-\varepsilon/400c}.$$

Clearly, for $\delta > 0$ small enough, the number of such iterations we can take 
will be bounded from above in terms of $\varepsilon$ and $c$. Furthermore, since at each 
step, $k'$ is at least half the size of $k$, so long as the initial value of $k$ is large 
enough in terms of $c$ and $\varepsilon$, we will not run out of dimensions. 

Since our theorem is a qualitative result, in that it does not even attempt 
to explain how $\delta$ or $k$ depends on $\varepsilon$ and $c$, there is no need to be more precise 
about just how small one needs to take $\delta$ or how large to take $k$, in order for 
our iteration process to terminate and prove our theorem. 
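
Although no explicit dependence is needed, a rough feel for the number of iterations can be had from the following Python sketch. It rests on simplifying assumptions that are not part of the argument above: the exponent of $|\Sigma(S)|$ to base $|A|$ is taken to start at $c$, to be multiplied by $1 - \varepsilon/400c$ at every shrinking step, and never to fall below a fixed positive floor. The growth of $\delta$ during the iteration is ignored, and the function name and sample parameters are hypothetical.

```python
import math


def max_shrinking_steps(c, eps, floor_exponent):
    """Largest t with c * (1 - eps / (400 * c)) ** t >= floor_exponent.

    Assumes 0 < floor_exponent < c and 0 < eps < 400 * c.
    """
    shrink = 1.0 - eps / (400.0 * c)
    # c * shrink**t >= floor  <=>  t <= log(floor / c) / log(shrink),
    # since both logarithms are negative.
    return math.floor(math.log(floor_exponent / c) / math.log(shrink))


c, eps, floor_exponent = 2.0, 0.25, 0.5   # sample values, chosen arbitrarily
t = max_shrinking_steps(c, eps, floor_exponent)
print(t)
```

Since $k$ is at most halved at each step, an initial $k$ of size at least $2^t$ would suffice under these assumptions, which is consistent with the statement that $k$ need only be large in terms of $c$ and $\varepsilon$.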

1.4 The iteration part of the argument 

Given a string $x$ of length $k/2$, we let $R_x$ denote the set of all strings $y$ of 
length $k/2$ such that 

$$xy \in S.$$

We analogously define $L_x$. 

We will now select an $x$, and therefore $R_x$, very carefully, so that it satisfies 
certain useful properties: We begin with the inequality 

$$\sum_x |R_x| = |S| \ge |A|^{k-\delta}.$$

We now apply the following lemma, which is easily proved upon using the 
Cauchy-Schwarz inequality: 

Lemma 1 Suppose that $V$ is a set of $n$ elements, and suppose that 

$$U_1, U_2, \ldots, U_r \subseteq V$$

satisfy 

$$\sum_{i=1}^{r} |U_i| \ge r n^{1-\delta}.$$

Then, there exists $1 \le j \le r$ such that 

$$\sum_{1 \le i \le r} |U_i \cap U_j| \ge r n^{1-2\delta}.$$
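
Lemma 1 can be sanity-checked numerically. Writing $T = \sum_i |U_i|$ and defining $\delta$ by $T = r n^{1-\delta}$, the conclusion $r n^{1-2\delta}$ equals $T^2/(rn)$, so the lemma asserts that some $j$ has $\sum_i |U_i \cap U_j| \ge T^2/(rn)$. The Python sketch below checks this on randomly chosen subsets; the sizes, the random seed and the helper name `best_overlap` are arbitrary illustrative choices.

```python
import random


def best_overlap(subsets):
    """Return (j, total) maximising total = sum_i |U_i & U_j| over j."""
    totals = [sum(len(u & v) for u in subsets) for v in subsets]
    j = max(range(len(subsets)), key=totals.__getitem__)
    return j, totals[j]


random.seed(0)
n, r = 200, 40
V = range(n)
# r random subsets of V of varying sizes
subsets = [set(random.sample(V, random.randint(20, 120))) for _ in range(r)]

T = sum(len(u) for u in subsets)
j, total = best_overlap(subsets)

# Lemma 1 in integer form: total >= T^2 / (r * n).
print(total, T * T / (r * n))
```

The inequality holds for any family of subsets, since averaging over $j$ and applying Cauchy-Schwarz to the multiplicities of the elements of $V$ gives the bound.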

From this lemma we easily deduce that there exists $x$ such that 

$$\sum_y |R_x \cap R_y| \ge |A|^{k-2\delta}.$$

Next, we let 

$$S' := \{yz \in S : z \in R_x\}, \qquad (2)$$

and we observe that 

$$|S'| = \sum_y |R_x \cap R_y| \ge |A|^{k-2\delta};$$

so, $S'$ is not too much smaller than $S$. 
We now make a reassignment: 

$$S \leftarrow S', \quad \delta \leftarrow 2\delta,$$

and observe that $S$ now satisfies 

$$|S| \ge |A|^{k-\delta},$$

and we in addition have that every element of $S$ can be expressed as $yz$, 
where $z \in R_x$. 
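
The passage from $S$ to $S'$ can be traced on a small example. In the Python sketch below, strings are represented as tuples, the ground set, the string length and the random choice of $S$ are hypothetical toy data, and `right_sets` is an illustrative helper that builds the sets $R_x$. The sketch picks the $x$ maximising $\sum_y |R_x \cap R_y|$ and confirms that this sum is exactly the size of the set of strings of $S$ whose right half lies in $R_x$.

```python
import random
from itertools import product


def right_sets(S, half):
    """Map each left half x to R_x = {y : xy in S}."""
    R = {}
    for s in S:
        R.setdefault(s[:half], set()).add(s[half:])
    return R


random.seed(1)
A = range(4)
half = 2   # strings have length k = 4 and are split into halves of length 2
S = {s for s in product(A, repeat=2 * half) if random.random() < 0.5}

R = right_sets(S, half)
# The x whose right set overlaps most with the others, as in Lemma 1.
x = max(R, key=lambda p: sum(len(R[p] & Ry) for Ry in R.values()))

# S' consists of the strings yz in S with z in R_x.
S_prime = {s for s in S if s[half:] in R[x]}
overlap = sum(len(R[x] & Ry) for Ry in R.values())
print(len(S), len(S_prime), overlap)
```

By construction every string of `S_prime` has its right half in `R[x]`, which is the property used later in the argument.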

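Now suppose that there is a string $y$ of length $k/2$ such that 

$$|R_y| \ge |A|^{k/2-2\delta},$$

and 

$$|\Sigma(R_y)| \le |\Sigma(S)|^{1-\varepsilon/400c}.$$

If this occurs, then we make another reassignment: 

$$S \leftarrow R_y, \quad k \leftarrow k/2, \quad \delta \leftarrow 2\delta,$$

and we start back at the very beginning of this subsection 1.4. 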



1.5 The sets H' and H" 

When we come out of the iteration loops ('reassignments') from the previous 
subsection, we finish with a set S having a number of highly useful properties, 
among them: 



• $|S| \ge |A|^{k-\delta}$. 

• Each $R_y \subseteq R_x$; and, 

• If we let $H$ denote those strings $h$ of length $k/2$ such that 

$$|R_h| \ge |A|^{k/2-2\delta},$$

then for every such $h$ we will have that 

$$|\Sigma(S)|^{1-\varepsilon/400c} < |\Sigma(R_h)| \le |\Sigma(S)|.$$

One can easily show, using the lower bound for $|S|$, that for $|A|$ sufficiently 
large, 

$$|H| \ge |A|^{k/2-2\delta}.$$

Since 

$$\sum_{z \in R_x} |\{h \in H : hz \in S\}| \ge |H| \cdot |A|^{k/2-2\delta},$$

we deduce that there exists $z \in R_x$ such that there are at least 

$$|H| \cdot |A|^{-2\delta} \ge |A|^{k/2-4\delta}$$

vectors $h \in H$ satisfying 

$$hz \in S. \qquad (3)$$

Fix one of these $z$, and let 

$$H' \subseteq H$$

denote all those $h \in H$ such that (3) holds. Note that 

$$|H'| \ge |A|^{k/2-4\delta}.$$

Next, let 

$$H'' \subseteq H'$$

denote those $h \in H'$ such that there are at least 

$$|H'| \cdot |\Sigma(H')|^{-1}/2 \qquad (4)$$

other $h' \in H'$ satisfying 

$$\Sigma(h') = \Sigma(h).$$

We have that 

$$|H' \setminus H''| \le |\Sigma(H')| \left(|H'| \cdot |\Sigma(H')|^{-1}/2\right) = |H'|/2.$$

So, 

$$|H''| \ge |H'|/2 \ge |A|^{k/2-5\delta},$$

for $|A|$ sufficiently large. 
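
The step of discarding strings with unpopular sums is easy to simulate. In the Python sketch below, strings are tuples of integers and a string is kept when the number of strings with the same coordinate sum (counting the string itself, a slight simplification of the phrase "other $h'$" above) is at least $|H'|/(2|\Sigma(H')|)$. The data and the helper name `popular_part` are hypothetical.

```python
import random
from collections import Counter
from itertools import product


def popular_part(H_prime):
    """Strings whose coordinate sum is shared by >= |H'| / (2 |Sigma(H')|) strings."""
    if not H_prime:
        return []
    fibre = Counter(sum(h) for h in H_prime)      # sum -> number of strings
    threshold = len(H_prime) / (2 * len(fibre))   # len(fibre) = |Sigma(H')|
    return [h for h in H_prime if fibre[sum(h)] >= threshold]


random.seed(2)
A = range(6)
H_prime = [h for h in product(A, repeat=3) if random.random() < 0.4]
H_double_prime = popular_part(H_prime)

print(len(H_prime), len(H_double_prime))
```

Each discarded sum accounts for fewer than `threshold` strings and there are at most $|\Sigma(H')|$ sums, so fewer than half of the strings are removed, mirroring the displayed count.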
We also note that 

$$|\Sigma(H'')| \le |\Sigma(H')| = |\Sigma(\{hz : h \in H'\})| \le |\Sigma(S)|.$$

This is one of the places where it was essential to have that $z \in R_h$ for 
$h \in H'$. 

Now suppose that, in fact, 

$$|\Sigma(H'')| \le |\Sigma(S)|^{1-\varepsilon/400c}. \qquad (6)$$

If so, then we assign 

$$S \leftarrow H'', \quad k \leftarrow k/2, \quad \delta \leftarrow 5\delta,$$

and we repeat our iteration process again, starting in subsection 1.4. 
On the other hand, if (6) does not hold, then we will have that 

$$|\Sigma(S)|^{1-\varepsilon/400c} < |\Sigma(H'')| \le |\Sigma(H')| \le |\Sigma(S)|. \qquad (7)$$

1.6 The final leg of the proof 

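From the fact that 

$$|\Sigma(\{hu \in S : h \in H'', u \in R_h\})| \le |\Sigma(S)|,$$

along with the fact that $R_h \subseteq R_x$ and 

$$|\Sigma(S)|^{1-\varepsilon/400c} < |\Sigma(R_h)| \le |\Sigma(R_x)| \le |\Sigma(S)|,$$

as well as (7), we deduce that there are at least 

$$|\Sigma(S)|^{3-3\varepsilon/400c}$$

quadruples 

$$\Sigma(h_1), \Sigma(h_2) \in \Sigma(H''), \quad\text{and}\quad \Sigma(u_1), \Sigma(u_2) \in \Sigma(R_x),$$

such that 

$$\Sigma(h_1) + \Sigma(u_1) = \Sigma(h_2) + \Sigma(u_2).$$

Now we apply Theorem 1, setting 

$$X := \Sigma(H''), \quad\text{and}\quad Y := \Sigma(R_x).$$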

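Following the comment after Theorem 2, we have that there exists 

$$\Sigma' \subseteq \Sigma(H''), \quad |\Sigma'| \ge |\Sigma(H'')|^{1-\varepsilon/2c},$$

such that 

$$|\Sigma' + \Sigma'| \le |\Sigma'|^{1+\varepsilon/2c}.$$

Let $H'''$ denote the set of all 

$$h \in H'',$$

such that 

$$\Sigma(h) \in \Sigma'.$$

By (4) and the lower bound on $|\Sigma'|$, together with (7), we have that 

$$\begin{aligned}
|H'''| &\ge |\Sigma'| \left(|H'| \cdot |\Sigma(H')|^{-1}/2\right) \\
&\ge |\Sigma(H'')|^{1-\varepsilon/2c}\, |H'| \cdot |\Sigma(H')|^{-1}/2 \\
&\ge |\Sigma(H'')|^{1-\varepsilon/2c}\, |\Sigma(H'')|^{-1/(1-\varepsilon/400c)}\, |H'|/2 \\
&\ge |\Sigma(H'')|^{-\varepsilon/c}\, |H'| \\
&\ge |A|^{k/2-4\delta-\varepsilon}.
\end{aligned}$$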





By simple averaging, there is some vector 

$$w \in A^{k/2-1},$$

such that there are at least 

$$|A|^{1-4\delta-\varepsilon}$$

vectors $h \in H'''$ whose last $k/2 - 1$ coordinates are the vector $w$. The upshot 
of this is that if we let 

$$A' := \{a \in A : aw \in H'''\},$$

then 

$$|A'| \ge |A|^{1-4\delta-\varepsilon}, \qquad (9)$$

and 

$$A' + A' + 2\Sigma(w) \subseteq \Sigma(H''') + \Sigma(H''') = \Sigma' + \Sigma'.$$

Now we apply a weak form of the Ruzsa-Plunnecke Theorem [4], given as 
follows: 

Theorem 5 Suppose that $X$ is some finite subset of an additive abelian 
group, such that 

$$|X + X| \le C|X|.$$

Then, we have that 

$$|kX| = |X + X + \cdots + X| \le C^k |X|.$$
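
Theorem 5 can be checked on a small set of integers. In the Python sketch below, the set `X` is an arbitrary illustrative choice and `k_fold_sumset` is a hypothetical helper. Taking $C = |X + X|/|X|$, the bound $|kX| \le C^k |X|$ is equivalent to the integer inequality $|kX| \cdot |X|^{k-1} \le |X + X|^k$, which avoids floating point.

```python
def k_fold_sumset(X, k):
    """The k-fold sumset kX = X + X + ... + X (k summands)."""
    sums = {0}
    for _ in range(k):
        sums = {s + x for s in sums for x in X}
    return sums


X = {0, 1, 2, 3, 5, 8, 13, 21}             # toy set of integers
doubling = len(k_fold_sumset(X, 2))        # |X + X|

checks = []
for k in range(2, 6):
    size = len(k_fold_sumset(X, k))
    # |kX| <= C^k |X| with C = |X + X| / |X|, cleared of denominators
    holds = size * len(X) ** (k - 1) <= doubling ** k
    checks.append(holds)
    print(k, size, holds)
```

For $k = 2$ the inequality is an equality by the choice of $C$; for larger $k$ it reflects the fact that iterated sumsets of a set with small doubling grow slowly.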

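Using 

$$X := \Sigma', \quad\text{and}\quad C := |\Sigma'|^{\varepsilon/2c},$$

we deduce that for $\ell$ even, 

$$|\ell A'| \le |\ell \Sigma'| \le |\Sigma'|^{1+\varepsilon\ell/2c} \le |A|^{c+\varepsilon\ell} \le |A'|^{(c+\varepsilon\ell)/(1-4\delta-\varepsilon)}.$$

By selecting $\delta > 0$ small enough, relative to $\varepsilon > 0$, we can ensure that for 
$\varepsilon < 1/2$, 

$$|\ell A'| \le |A'|^{c(1+2\ell\varepsilon)}.$$

Of course, when $1/2 \le \varepsilon < 1$ the inequality is trivial, as $c > 1$. Clearly, on 
rescaling $\varepsilon$ appropriately, our theorem is proved. 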